Implement `refscan`-based real-time referential integrity checking on /metadata/json:validate #835
Previously, we checked referential integrity after inserting documents into _each_ specified collection, so we could not tell whether a referenced document was going to be inserted into a later collection.
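
To illustrate why the timing of the check matters, here is a minimal sketch that gathers the ids from every incoming collection first and only then looks for unresolved references. It is not the code from this branch and it does not use refscan: the payload shape, the `refs` key, and the function name `find_dangling_references` are assumptions made for the example, and it ignores documents that already exist in the database.

```python
from typing import Dict, List, Tuple

# Hypothetical payload shape: collection name -> list of documents.
# Each document has an "id"; the "refs" key (ids it points to) is an
# assumption made for this sketch only.
Payload = Dict[str, List[dict]]


def find_dangling_references(payload: Payload) -> List[Tuple[str, str, str]]:
    """Return (collection, referrer_id, missing_id) for each unresolved reference."""
    # Pass 1: gather the ids of every document in every incoming collection.
    known_ids = {doc["id"] for docs in payload.values() for doc in docs}

    # Pass 2: check references only now, so collection order no longer matters.
    dangling = []
    for collection_name, docs in payload.items():
        for doc in docs:
            for ref_id in doc.get("refs", []):
                if ref_id not in known_ids:
                    dangling.append((collection_name, doc["id"], ref_id))
    return dangling


payload = {
    "referrers": [{"id": "a:1", "refs": ["b:1", "b:2"]}],
    "referees": [{"id": "b:1"}],  # listed after the collection that refers to it
}
print(find_dangling_references(payload))
# → [('referrers', 'a:1', 'b:2')]
```

Checking collection by collection would have flagged `b:1` as missing while processing `referrers`, even though it arrives in the later `referees` collection.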
…time-referential-integrity-validation-in-runtime
The failing test involves the use of the string " For now, I'll update the referential integrity checking code to ignore incoming collections that are not lists.
I don't know the use case here. It is a use case that the preexisting validation stages allowed for. I do not see any documentation about it.
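
One way to picture "ignore incoming collections that are not lists" is a filter applied before the reference check. This is only a sketch under assumed names (`list_valued_collections`, and a payload that maps names to arbitrary values); the actual handling on this branch may differ.

```python
from typing import Any, Dict, List


def list_valued_collections(payload: Dict[str, Any]) -> Dict[str, List[Any]]:
    """Keep only the payload entries whose value is a list."""
    return {
        name: value
        for name, value in payload.items()
        if isinstance(value, list)  # anything else is skipped, not rejected
    }


payload = {
    "some_collection": [{"id": "x:1"}],
    "other_key": "a plain string",  # hypothetical non-list value
}
print(list_valued_collections(payload))
# → {'some_collection': [{'id': 'x:1'}]}
```

Skipping rather than rejecting such entries leaves the verdict on them to the preexisting validation stages.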
In this branch, I implemented real-time referential integrity checking on the `/metadata/json:validate` API endpoint. I also added automated tests that target the `validate_json` function.

Details
Real-time referential integrity checking
Previously, the `validate_json` function (the function that underlies the `/metadata/json:validate` API endpoint) did not validate inter-document references.

On this branch, I updated that function so that it does validate inter-document references, but only if the caller "opts into" that by setting the newly-introduced `check_inter_document_references: bool = False` parameter (to the `validate_json` function) to `True`.

Based upon recent conversations with stakeholders, I "opted in" only the `/metadata/json:validate` API endpoint to this new validation. I did not "opt in" the `/metadata/json:submit` API endpoint. That is so people can continue to run JSON-submitting code (i.e. API clients) that creates referrer documents before creating the referee documents, a sequence of events that results in the database not having full referential integrity for a period of time.

The core referential integrity checking is done by a function `import`-ed from the refscan PyPI package. On this branch, I have added `refscan` to `requirements/main.in` (followed by running `$ make update-deps`).

Tests targeting the `validate_json` function

Previously, there were no unit tests targeting the `validate_json` function. I added some on this branch.

In addition to targeting the original behavior of the function, which did not check inter-document references, the tests also target the new behavior of the function, where it checks inter-document references.
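
As a rough illustration of the opt-in design and the two kinds of tests described above, here is a self-contained sketch. Only the parameter name `check_inter_document_references` and its `False` default come from this description; the function name `validate_json_sketch`, the payload shape, the `refs` key, and the error messages are hypothetical, and the real function delegates the reference check to refscan rather than doing it inline.

```python
from typing import Dict, List


def validate_json_sketch(
    payload: Dict[str, List[dict]],
    check_inter_document_references: bool = False,
) -> List[str]:
    """Return a list of error messages; an empty list means the payload passed."""
    errors: List[str] = []

    # Baseline stage (always runs): every document needs an "id".
    for collection_name, docs in payload.items():
        for doc in docs:
            if "id" not in doc:
                errors.append(f"{collection_name}: document without an id")

    # Opt-in stage: runs only when the caller explicitly asks for it.
    if check_inter_document_references:
        known_ids = {
            doc["id"] for docs in payload.values() for doc in docs if "id" in doc
        }
        for collection_name, docs in payload.items():
            for doc in docs:
                for ref_id in doc.get("refs", []):
                    if ref_id not in known_ids:
                        errors.append(
                            f"{collection_name}: {doc.get('id')} references missing {ref_id}"
                        )
    return errors


# A referrer whose referee is absent from the payload.
PAYLOAD = {"referrers": [{"id": "a:1", "refs": ["b:1"]}]}


def test_default_skips_reference_check():
    # Original behavior: the dangling reference goes unnoticed.
    assert validate_json_sketch(PAYLOAD) == []


def test_opt_in_reports_dangling_reference():
    # New behavior: the caller opted in, so the dangling reference is reported.
    errors = validate_json_sketch(PAYLOAD, check_inter_document_references=True)
    assert errors == ["referrers: a:1 references missing b:1"]


test_default_skips_reference_check()
test_opt_in_reports_dangling_reference()
```

Defaulting the flag to `False` keeps existing callers, such as the submit endpoint, on the old behavior unless they are changed deliberately.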
Related issue(s)
#831
Related subsystem(s)
`docs` directory)
Testing
I tested these changes by implementing and running unit tests that target them.
Documentation
`docs` directory)
Maintainability
`study_id: str`)
`# TODO` or `# FIXME`
`black` to format all the Python files I created/modified